-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GraphBolt] Add ItemSet/Dict4
#7382
Conversation
To trigger regression tests:
|
|
||
__all__ = ["ItemSet", "ItemSetDict"] | ||
__all__ = ["ItemSet", "ItemSetDict", "ItemSet4", "ItemSetDict4"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more meaningful name instead of ItemSet4
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a temporary name since it will be replaced right away.
|
||
class ItemSet4(Dataset): | ||
r"""Class for iterating over tensor-like data. | ||
Experimental. Implemented only __getitem__() accepting slice and list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docstring and examples to be added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add the high-level design doc here.
|
||
def __init__( | ||
self, | ||
items: Union[torch.Tensor, Mapping, Tuple[Mapping]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need Mapping
? we do need int
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I'm considering to disable
int
, becauseint
doesn't contain anydtype
info. To generate anall_nodes_set
we can always use a tensor scalar (which is how we do it now). What's your opinion? Mapping
is indded to "large-scope", but I'm trying to differentiateItemSet4
from existingItemSet
which takes inIterable
. I thinkSequence
might be a good choice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://docs.python.org/3/library/collections.abc.html#collections.abc.Sequence
Sequence
best matches our design in my opinion.
@@ -442,3 +443,188 @@ def __repr__(self) -> str: | |||
itemsets=itemsets_str, | |||
names=self._names, | |||
) | |||
|
|||
|
|||
class ItemSet4(Dataset): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it just removing __iter__
from existing ItemSet/Dict
and inherit from Dataset
only? Other code are directly copied? If yes, could we just modify upon the existing one directly? will it break existing ItemSampler
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not just simply copying existing ItemSet
. But I can try to modify upon the existing code directly.
Description
Add
ItemSet/Dict4
that will later replace currentItemSet/Dict
.Only add a few test cases because most of the functionalities already have tests covered above. The code can be reused when replacing.
TODO:
Checklist
Please feel free to remove inapplicable items for your PR.
Changes